Text-Based Age and Gender Prediction for Online Safety Monitoring
نویسندگان
چکیده
This paper explores the capabilities of text-based age and gender prediction geared towards the application of detecting harmful content and conduct on social media. More specifically, we focus on the use case of detecting sexual predators who try to “groom” children online and possibly provide false age and gender information in their user profiles. We perform age and gender classification experiments on a dataset of nearly 380,000 Dutch chat posts from a social network. We evaluate and compare binary age classifiers trained to separate younger and older authors according to different age boundaries and find that macro-averaged Fscores increase when the age boundary is raised. Furthermore, we show that use-case applicable performance levels can be achieved for the classification of minors versus adults, thereby providing a useful component in a cybersecurity monitoring tool for social network moderators.
منابع مشابه
A Document Weighted Approach for Gender and Age Prediction Based on Term Weight Measure
Author profiling is a text classification technique, which is used to predict the profiles of unknown text by analyzing their writing styles. Author profiles are the characteristics of the authors like gender, age, nativity language, country and educational background. The existing approaches for Author Profiling suffered from problems like high dimensionality of features and fail to capture th...
متن کاملNeuro-Fuzzy Based Algorithm for Online Dynamic Voltage Stability Status Prediction Using Wide-Area Phasor Measurements
In this paper, a novel neuro-fuzzy based method combined with a feature selection technique is proposed for online dynamic voltage stability status prediction of power system. This technique uses synchronized phasors measured by phasor measurement units (PMUs) in a wide-area measurement system. In order to minimize the number of neuro-fuzzy inputs, training time and complication of neuro-fuzzy ...
متن کاملOnline Monitoring for Industrial Processes Quality Control Using Time Varying Parameter Model
A novel data-driven soft sensor is designed for online product quality prediction and control performance modification in industrial units. A combined approach of time variable parameter (TVP) model, dynamic auto regressive exogenous variable (DARX) algorithm, nonlinear correlation analysis and criterion-based elimination method is introduced in this work. The soft sensor performance validation...
متن کاملOnline Voltage Stability Monitoring and Prediction by Using Support Vector Machine Considering Overcurrent Protection for Transmission Lines
In this paper, a novel method is proposed to monitor the power system voltage stability using Support Vector Machine (SVM) by implementing real-time data received from the Wide Area Measurement System (WAMS). In this study, the effects of the protection schemes on the voltage magnitude of the buses are considered while they have not been investigated in previous researches. Considering overcurr...
متن کاملAuthor gender identification from text using Bayesian Random Forest
Nowadays high usage of users from virtual environments and their connection via social networks like Facebook, Instagram, and Twitter shows the necessity of finding out shared subjects in this environment more than before. There are several applications that benefit from reliable methods for inferring age and gender of users in social media. Such applications exist across a wide area of fields,...
متن کامل